3. [] This is an express request to begin national examination procedures (35 U.S.C. §371(f)) at any time rather than delay
examination until the expiration of the applicable time limit set in 35 U.S.C. §371(b) and PCT Articles 22 and 39(1).
Claims                                         Number Filed    Number Extra    Rate
Total Claims                                                   -20 =           X $18.00
Independent Claims                                             -3 =            X $80.00
Multiple dependent claim(s) (if applicable)                                    + $270.00
TOTAL NATIONAL FEE =                                                           $860.00
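
The claim-related rows of the fee table follow a simple rule: claims beyond 20 and independent claims beyond 3 are charged at the listed rates, and a flat amount is added when any multiple dependent claim is present. The sketch below illustrates only that arithmetic. The function name and parameters are hypothetical, the rates are the ones printed in the table, and the basic national fee (not itemized in this excerpt) is deliberately left out.

```python
def extra_claim_fees(total_claims, independent_claims, multiple_dependent,
                     total_rate=18.00, independent_rate=80.00,
                     multiple_dependent_fee=270.00):
    """Return the claim-related surcharge implied by the fee table rows.

    Rates default to the values printed in the table; the basic national
    fee is not included because the excerpt does not itemize it.
    """
    extra_total = max(total_claims - 20, 0)            # "Total Claims - 20"
    extra_independent = max(independent_claims - 3, 0)  # "Independent Claims - 3"
    fee = extra_total * total_rate + extra_independent * independent_rate
    if multiple_dependent:
        fee += multiple_dependent_fee                   # flat surcharge
    return fee


# Hypothetical claim counts, chosen only to exercise each row of the table.
print(extra_claim_fees(9, 3, False))   # 0.0
print(extra_claim_fees(25, 5, True))   # 520.0
```

With no claims over the thresholds and no multiple dependent claims, the surcharge is zero, which is the situation the preliminary amendment aims for when it removes multiple dependencies.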
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
In re application of
Seishi KATO et al. Attn: BOX PCT

Serial No. NEW Docket No. 2001_1023A

Filed July 20, 2001

A HUMAN NUCLEAR PROTEIN HAVING A WW 
DOMAIN AND A POLYNUCLEOTIDE ENCODING 
THE PROTEIN 

[Corresponding to PCT/JP00/08253 
Filed November 22, 2000] 

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents, 
Washington, DC 20231 

Sir: 

Prior to calculating the filing fee, please amend the above-identified application as 
follows: 

IN THE SPECIFICATION 
Page 1, immediately after the title, please insert: 
This application is a 371 of PCT/JP00/08253 filed November 22, 2000.

IN THE CLAIMS 
Please amend the claims as follows: 

5. (Amended) An expression vector expressing the polynucleotide of claim 2 in in vitro 
translation or in host cells. 






protein of claim 1, and which polynucleotide comprises the nucleotide sequence of SEQ ID NO. 
2. 

Please add the following new claims: 

8. An expression vector expressing the polynucleotide of claim 3 in in vitro 
translation or in host cells. 

9. A transformed cell producing the human nuclear protein of claim 1, which is a cell 
transformed with an expression vector which expresses a polynucleotide encoding the protein of 
claim 1, and which polynucleotide consists of the nucleotide sequence of SEQ ID NO. 2. 



REMARKS 



The specification has been amended to reflect the 371 status. In addition, the claims have 
been amended to remove the multiple dependencies to reduce the PTO filing fee. 

Attached hereto is a marked-up version of the changes made to the specification and 
claims by the current amendment. The attached pages are captioned "Version with markings to
show changes made".

Favorable action on the merits is solicited. 



WMC/dlk 

Washington, D.C. 20006-1021 
Telephone (202) 721-8200 
Facsimile (202) 721-8250 
July 20, 2001 



Respectfully submitted, 



Seishi KATO et al. 




Warren M. Cheek, Jr.
Registration No. 33,367 
Attorney for Applicants 




Version with Markings to Show Changes Made

CLAIMS 



1. An isolated and purified human nuclear protein comprising the amino
acid sequence of SEQ ID NO: 1.

2. A polynucleotide encoding the protein of claim 1, which comprises the
nucleotide sequence of SEQ ID NO: 2.

3. The polynucleotide of claim 2, consisting of the nucleotide sequence of
SEQ ID NO: 2.

4. A human genomic DNA fragment with which a polynucleotide of SEQ ID
NO: 3 or a partial contiguous sequence thereof hybridizes under stringent
conditions.

5. An expression vector expressing the polynucleotide of claim 2 in in
vitro translation or in host cells.

6. A transformed cell producing the human nuclear protein of claim 1,

7. An antibody against the human nuclear protein of claim 1.
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE



In re application of 
Seishi KATO et al. 
Serial No. 09/889,722 
Filed July 20, 2001 

A HUMAN NUCLEAR PROTEIN HAVING A 
WW DOMAIN AND A POLYNUCLEOTIDE 
ENCODING THE PROTEIN 
[Corresponding to PCT/JP00/08253 
Filed November 22, 2000] 




Attn: BOX PCT
Docket No. 2001_1023A



AMENDMENT AND REPLY TO NOTIFICATION OF MISSING REQUIREMENTS
UNDER 35 USC 371

Assistant Commissioner for Patents, 
Washington, DC 20231 

Sir: 

In response to the PTO Notification of Missing Requirements Under 35 USC 371 dated September 
10, 2001, submitted herewith is a Declaration for the above application executed by the inventors. 

Enclosed is a paper and computer readable copy of the Sequence Listing. Please replace the 
Sequence Listing originally filed with the attached substitute Sequence Listing. No new matter is added. 

Also enclosed are the PTO surcharge of $130.00 required by 37 CFR 1.492(e), and a copy of the
PTO notice. 

It is respectfully submitted that the application is now complete, and early indication thereof is now 
requested. 

Respectfully submitted, 
Seishi KATO et al. 



By. 




Warren M. Cheek, Jr. 
Registration No. 33,367
Attorney for Applicants 



WMC/dlk 

Washington, D.C. 20006-1021 
Telephone (202) 721-8200 
Facsimile (202) 721-8250 
October 19, 2001 
DESCRIPTION



A Human Nuclear Protein having a WW Domain and 
A Polynucleotide encoding the Protein 

This application is a 371 of PCT/JP00/08253 filed November 22, 2000.

Technical Field 



The present invention relates to a novel protein having a WW domain
and existing in human cell nuclei, a polynucleotide encoding this protein, and
an antibody against this protein. The protein and antibody of the present
invention are useful for diagnosis and therapy of various diseases, and the
polynucleotide of the present invention is useful as a probe for genetic diagnosis
or as a genetic source for gene therapy. Further, the polynucleotide can be
used as a genetic source for large-scale production of the protein of this
invention.

Background Art 

The term "nuclear protein" is a generic name for proteins functioning in
the cell nucleus. The nucleus contains genomic DNA, which serves as the
blueprint of the organism, and nuclear proteins are involved in replication,
transcriptional regulation, etc. of this genomic DNA. Typical nuclear proteins
whose functions have been revealed include transcription factors, splicing
factors, intranuclear receptors, cell cycle regulators and tumor suppressors.
These factors are closely related not only to life phenomena such as development and
differentiation but also to diseases such as cancers (New Medical Science,
"Tensha No Shikumi To Shikkan" (Mechanism of Transcription and Diseases), ed.
by Masahiro Muramatsu). Accordingly, these nuclear proteins are expected as
target proteins for developing low-molecular pharmaceutical preparations that
regulate transcription and translation of specific genes, and it is desired to
obtain as many nuclear proteins as possible.

The WW domain belongs to a new family of protein-protein interaction
motifs resembling SH2, SH3, PH and PTB domains. It is known that this
domain consists of about 40 amino acid residues containing 2 conserved
tryptophan residues and, like the SH3 domain, binds to a proline-rich amino
acid sequence (H. I. Chen and M. Sudol, Proc. Natl. Acad. Sci. USA, 92, 7819-7823, 1995).
As a result of X-ray crystallographic analysis of a WW domain/ligand conjugate,
it was revealed that the three-dimensional structure of the WW domain is
different from that of SH3 (M. J. Macias et al., Nature, 382, 646-649, 1996).
Like other protein motifs, the WW domain is contained in the cytoskeleton
system (P. Bork and M. Sudol, TIBS, 19, 531-533, 1994), in proteins
participating in the signal transduction system (H. I. Chen and M. Sudol, Proc.
Natl. Acad. Sci. USA, 92, 7819-7823, 1995), in a ubiquitin-protein ligase in the protein
degradation system (O. Staub et al., EMBO J., 15, 2371-2380, 1996) and in a
transcription activator (P. Bork and M. Sudol, TIBS, 19, 531-533, 1994), and is
believed to play an important role in the intracellular signal transduction
system.






The object of the present invention is to provide a novel protein present 
in human cell nucleus, a polynucleotide encoding this protein, and an antibody 
against this nuclear protein. 



Disclosure of Invention 






To achieve the object described above, the present application provides 
the following inventions (1) to (7): 
(1) An isolated and purified human nuclear protein comprising the amino
acid sequence of SEQ ID NO: 1.

(2) A polynucleotide encoding the protein of the invention (1), which
comprises the nucleotide sequence of SEQ ID NO: 2.

(3) The polynucleotide of the invention (2), consisting of the nucleotide
sequence of SEQ ID NO: 2.

(4) A human genomic DNA fragment with which a polynucleotide of SEQ ID
NO: 3 or a partial contiguous sequence thereof hybridizes under stringent
conditions.

(5) An expression vector expressing the polynucleotide of the invention (2)
or (3) in in vitro translation or in host cells.

(6) A transformed cell producing the human nuclear protein of the
invention (1), which is a transformant with the expression vector of the invention
(5).

(7) An antibody against the human nuclear protein of the invention (1).

Best Mode for Carrying Out the Invention

The protein of the invention (1) can be obtained by a method of isolation
thereof from human organs, cell lines, etc., by a method of preparing the peptide
through chemical synthesis on the basis of the amino acid sequence set forth in
SEQ ID NO: 1, or by a method of production thereof by recombinant DNA
technique using the polynucleotide encoding the amino acid sequence of SEQ
ID NO: 1, among which the method with recombinant DNA technique is
preferably used. For example, a vector harboring the polynucleotide of the
invention (2) or (3) is subjected to in vitro transcription to prepare RNA, which is
then used as a template in in vitro translation, whereby the protein can be
expressed in vitro. Further, by integrating the polynucleotide by a conventional
method into a suitable expression vector, the protein encoded by the
polynucleotide can be expressed in a large amount in procaryotes such as E.
coli, Bacillus subtilis, etc. or eucaryotes such as yeasts, insect cells and
mammalian cells.



To produce the protein of the invention (1) by expressing the DNA
through in vitro translation, the polynucleotide of the invention (2) or (3) is
integrated in a vector harboring an RNA polymerase promoter (the invention (5)),
and the vector is added to an in vitro translation system such as a rabbit
reticulocyte lysate or a wheat germ extract containing an RNA polymerase
compatible with said promoter, whereby the protein of the invention (1) can be
produced in vitro. The RNA polymerase promoter includes e.g. T7, T3 and SP6.
The vector harboring such an RNA polymerase promoter includes e.g. pKA1,
pCDM8, pT3/T7 18, pT7/3 19, and pBluescript II.



To produce the protein of the invention (1) by expressing the DNA in
microorganisms such as E. coli, the polynucleotide of the invention (2) or (3) is
integrated in an expression vector harboring an origin capable of replication in
microorganisms, a promoter, a ribosome-binding site, a DNA cloning site, a
terminator, etc. to prepare the expression vector (the invention (5)), which is then
used for transformation of host cells, and by culturing the resulting
transformant (the invention (6)), the protein encoded by said polynucleotide can
be produced in a large amount in the microorganism. If an initiation codon
and a termination codon have been added respectively to sites upstream and
downstream from an arbitrary translated region in said expression vector, a
protein fragment containing the arbitrary region can be obtained by expressing
the DNA. Alternatively, it can also be expressed as a fusion protein with
another protein. By cleaving this fusion protein with a suitable protease, only
the part corresponding to the protein encoded by said polynucleotide can be obtained. The
expression vector for E. coli includes e.g. pUC series vectors, pBluescript II, pET
expression system vectors and pGEX expression system vectors.

To produce the protein of the invention (1) by expressing the DNA in
eucaryotes, the translated region of the polynucleotide of the invention (2) or (3)
is integrated in a eucaryotic expression vector harboring a promoter, a splicing
region, a poly(A) addition site, etc. to prepare the expression vector (the
invention (5)), which is then used for transforming eucaryotic cells (the invention
(6)), whereby the protein of the invention (1) can be produced in the eucaryotic
cells. The expression vector includes e.g. pKA1, pCDM8, pSVK3, pMSG, pSVL,
pBK-CMV, pBK-RSV, EBV vector, pRS and pYES2. If vectors such as
pIND/V5-His, pFLAG-CMV-2, pEGFP-N1 and pEGFP-C1 are used, the protein
of the present invention can also be expressed as a fusion protein having
various tags such as His tag, FLAG tag and GFP added thereto. As the
eucaryotic cells, mammalian cultured cells such as simian renal cells COS7 and
Chinese hamster ovary cells CHO, budding yeasts, fission yeasts, silkworm cells
and Xenopus oocytes are generally used, but insofar as the protein of the
invention (1) can be expressed, any eucaryotic cells can be used. For
introducing the expression vector into eucaryotic cells, conventional methods
such as the electroporation method, calcium phosphate method, liposome
method and DEAE-dextran method can be used.

For isolating and purifying the protein of the invention (1) from a
culture after expression of the desired protein in the procaryotic or eucaryotic
cells, separation techniques known in the art can be used in combination.
Such techniques include e.g. treatment with a denaturant such as urea or a
surfactant, sonication, enzymatic digestion, salting-out or solvent precipitation,
dialysis, centrifugation, ultrafiltration, gel filtration, SDS-PAGE, isoelectric
focusing, ion-exchange chromatography, hydrophobic chromatography, affinity
chromatography and reverse phase chromatography.

The protein of the invention (1) encompasses peptide fragments (each
consisting of 5 or more amino acid residues) containing any partial amino acid
sequence from SEQ ID NO: 1. Such a peptide fragment can be used as an
antigen for preparing the antibody of the present invention. Further, the
protein of the invention (1) encompasses fusion proteins with another arbitrary
protein. For example, fusion proteins with glutathione S-transferase (GST) or
green fluorescent protein (GFP), described in the Examples, can be mentioned.

The polynucleotide (cDNA) of the invention (2) or (3) can be cloned from
a cDNA library derived from e.g. human cells. The cDNA is synthesized using
poly(A)+ RNA extracted from human cells as a template. The human cells may
be either cultured cells or cells excised by an operation etc. from the human
body. The cDNA can be synthesized by any method such as the
Okayama-Berg method (Okayama, H. and Berg, P., Mol. Cell. Biol., 2, 161-170,
1982) and the Gubler-Hoffman method (Gubler, U. and Hoffman, J., Gene, 25,
263-269, 1983), but for efficiently obtaining full-length clones, the Capping
method (Kato, S. et al., Gene, 150, 243-250, 1994) described in the Examples is
preferably used.

The polynucleotide of the invention (2) comprises the nucleotide
sequence of SEQ ID NO: 2, and for example, the polynucleotide consisting of the
nucleotide sequence of SEQ ID NO: 3 has a 2669-bp nucleotide sequence
containing a 2115-bp open reading frame (ORF). This ORF encodes a protein
consisting of 704 amino acid residues. The polynucleotide of the invention (3)
comprises the 2115-bp nucleotide sequence (SEQ ID NO: 2) constituting this
ORF. By expressing the cDNA of the invention (2) or (3) in E. coli or animal
cultured cells, a protein of about 80 kDa was obtained. This protein binds to a
C-terminal domain of RNA polymerase II, so it is considered to participate in
transcriptional regulation.
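
The figures in this paragraph can be cross-checked with a little arithmetic. The sketch below assumes that the 2115-bp ORF length counts the termination codon (an inference from 2115 / 3 = 705 codons versus 704 residues, not something the text states) and uses an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from this application. The function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110  # rule-of-thumb average; an assumption, not from the text


def orf_summary(orf_bp, includes_stop_codon=True):
    """Return (codons, residues, approximate mass in kDa) for an ORF length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon is translated into no residue, so it is subtracted if counted.
    residues = codons - 1 if includes_stop_codon else codons
    approx_kda = residues * AVG_RESIDUE_MASS_DA / 1000
    return codons, residues, approx_kda


codons, residues, approx_kda = orf_summary(2115)
print(codons, residues, round(approx_kda, 2))  # 705 704 77.44
```

The rough estimate of about 77 kDa is in the same range as the approximately 80-kDa product reported above.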

Since the protein of the invention (1) is expressed in any tissues, the 
same clone as the polynucleotide of the invention (2) or (3) can be easily 
obtained from a human cDNA library prepared from human cells by screening 
the library with an oligonucleotide probe synthesized on the basis of the 
nucleotide sequence of the polynucleotide set forth in SEQ ID NO: 2 or 3. 
Alternatively, the objective cDNA can also be synthesized by polymerase chain 
reaction (PCR) by use of such oligonucleotides as primers. 

Generally, polymorphism of human genes occurs frequently due to
individual variations. Accordingly, those polynucleotides where, in SEQ ID NO:
2 or 3, one or more nucleotides have been added, deleted and/or substituted
with other nucleotides fall under the scope of the invention (3) or (4).

Likewise, those proteins where, in SEQ ID NO: 1, one or more amino
acids have been added, deleted and/or substituted with other amino acids as a
result of such alterations to nucleotides also fall under the scope of the
invention (1) insofar as they have the activity of a protein having the amino acid
sequence of SEQ ID NO: 1.

The polynucleotide of the invention (2) or (3) encompasses DNA 
fragments (10 bp or more) containing any partial nucleotide sequence from the 
sequence of SEQ ID NO: 2 or 3. Further, DNA fragments consisting of a sense 
or antisense strand thereof fall under the scope of this invention. These DNA 
fragments can be used as probes for genetic diagnosis. 
The invention (4) is concerned with a human genomic DNA fragment
with which the polynucleotide of SEQ ID NO: 3 or a partial contiguous sequence
thereof hybridizes under stringent conditions. As used herein, the stringent
conditions are those that enable specific and detectable binding between the
polynucleotide of SEQ ID NO: 3 or a partial contiguous sequence thereof (30 bp
or more) and chromosome-derived genomic DNA. The stringent conditions are
defined in terms of salt concentration, organic solvent (e.g., formamide),
temperature and other known conditions. That is, stringency is increased by a
decrease in salt concentration, by an increase in organic solvent concentration,
or by an increase in hybridization temperature. For example, the stringent salt
concentration is usually about 750 mM or less NaCl and about 75 mM or less
trisodium citrate, more preferably about 500 mM or less NaCl and about 50 mM
or less trisodium citrate and most preferably about 250 mM or less NaCl and
about 25 mM or less trisodium citrate. The stringent organic solvent
concentration is about 35 % or more formamide, most preferably about 50 % or
more formamide. The stringent temperature condition is about 30 °C or more,
more preferably about 37 °C or more and most preferably about 42 °C or more.
The other conditions include hybridization time, the concentration of a
detergent (e.g. SDS), the presence or absence of carrier DNA, etc., and by
combining these conditions, varying stringency can be established. Further,
the conditions for washing after hybridization also affect stringency. The
washing conditions are also defined in terms of salt concentration and
temperature, and the stringency of washing is increased by a decrease in salt
concentration or by an increase in temperature. For example, the stringent
salt condition for washing is about 30 mM or less NaCl and about 3 mM or less
trisodium citrate, most preferably about 15 mM or less NaCl and about 1.5 mM
or less trisodium citrate. The stringent temperature condition for washing is
about 25 °C or more, more preferably about 42 °C or more and most preferably
about 68 °C or more. The genomic DNA fragment of the invention (4) can be
isolated for example by subjecting a genome library prepared from human
chromosomal DNA to screening by the above stringent hybridization with said
polynucleotide as a probe and subsequent washing.
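
The NaCl/trisodium citrate pairs above all keep a 10:1 ratio, which matches the composition of SSC buffer (1x SSC being 150 mM NaCl and 15 mM trisodium citrate). The following sketch, with a hypothetical helper name, converts each pair into the corresponding SSC multiple. Reading the salt limits as SSC strengths is an interpretation added here, not wording from the specification.

```python
SSC_1X_NACL_MM = 150.0     # NaCl in 1x SSC
SSC_1X_CITRATE_MM = 15.0   # trisodium citrate in 1x SSC


def ssc_multiple(nacl_mm, citrate_mm):
    """Express a NaCl/citrate pair as a multiple of 1x SSC.

    Raises ValueError if the two components do not correspond to the
    same SSC strength.
    """
    from_nacl = nacl_mm / SSC_1X_NACL_MM
    from_citrate = citrate_mm / SSC_1X_CITRATE_MM
    if abs(from_nacl - from_citrate) > 1e-9:
        raise ValueError("salt pair is not a simple SSC dilution")
    return from_nacl


# Upper limits quoted in the text: three for hybridization, two for washing.
for nacl, citrate in [(750, 75), (500, 50), (250, 25), (30, 3), (15, 1.5)]:
    print(f"{nacl} mM NaCl / {citrate} mM citrate = {ssc_multiple(nacl, citrate):.2f}x SSC")
```

On this reading the hybridization limits correspond to roughly 5x, 3.3x and 1.7x SSC, and the washing limits to 0.2x and 0.1x SSC.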

The genomic DNA fragment of the invention (4) comprises 
expression-regulating regions (promoter/enhancer and suppressor sequences,
etc.) for the region coding for the protein of the invention (1). These 
expression-regulating regions are useful as a material for screening a material 
regulating in vivo expression of the protein of the invention (1). 



The antibody of the invention (7) can be obtained from serum of an
animal immunized with the protein of the invention (1) as an antigen. The
antigen used may be a peptide chemically synthesized on the basis of the amino
acid sequence of SEQ ID NO: 1 or the protein expressed in the eucaryotic or
procaryotic cells. Alternatively, the antibody can be prepared by introducing
the above-described expression vector for eucaryotic cells through an injection
or a gene gun into animal muscles or skin and then collecting serum (e.g., an
invention in JP-7-313187A). As the animal, a mouse, rat, rabbit, goat, chicken
or the like is used. If a hybridoma is produced by fusing myeloma cells with B
cells collected from the spleen of the immunized animal, a monoclonal antibody
against the protein of the invention (1) can be produced by the hybridoma.



Examples 



The present invention will be described in more detail by reference to
the Examples, which however are not intended to limit the scope of the present
invention. Basic procedures for DNA recombination and enzymatic reaction
were in accordance with those described in the literature (Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Laboratory, 1989). Unless otherwise
specified, the restriction enzymes and various modifying enzymes used were
products of Takara Shuzo Co., Ltd. The buffer composition in each enzymatic
reaction, as well as the reaction conditions, followed the instructions attached to
the kits. Synthesis of cDNA was conducted according to the literature (Kato, S.
et al., Gene, 150, 243-250, 1994).

(i) cDNA cloning 

As a result of large-scale determination of the nucleotide sequences of 
cDNA clones selected from a human full-length cDNA library (described in 
WO97/03190), clone HP03494 was obtained. This clone had a structure made 
of a 291-bp 5'-untranslated region, a 2115-bp ORF and a 263-bp
3'-untranslated region (SEQ ID NO: 3). The ORF encodes a protein consisting
of 704 amino acid residues. 
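
The three region lengths given for clone HP03494 can be laid out as coordinates on the cDNA. The sketch below assumes 1-based inclusive numbering and that the three regions are contiguous and account for the whole of SEQ ID NO: 3. Both points are assumptions made for illustration, and the names are hypothetical.

```python
# Region lengths (bp) as stated for clone HP03494.
REGIONS = [("5'-UTR", 291), ("ORF", 2115), ("3'-UTR", 263)]


def layout(regions):
    """Return ({name: (start, end)}, total_length) with 1-based inclusive coordinates."""
    coords = {}
    start = 1
    for name, length in regions:
        end = start + length - 1
        coords[name] = (start, end)
        start = end + 1
    return coords, start - 1


coords, total = layout(REGIONS)
print(coords)  # {"5'-UTR": (1, 291), 'ORF': (292, 2406), "3'-UTR": (2407, 2669)}
print(total)   # 2669
```

The total of 2669 bp agrees with the length given for SEQ ID NO: 3 in the description.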

Using the amino acid sequence (SEQ ID NO: 1) of this protein, a protein
database was searched, but none of the known proteins had homology to this
protein. Further examination of GenBank by using the nucleotide sequence of
its cDNA indicated that some ESTs (e.g. Accession No. AI758365) have 90 % or
more homology thereto, but they are partial sequences, so whether or not they
code for the same protein as the protein of this invention cannot be judged.

Examination of motif sequences indicated that as shown in Table 1, the 
region of from the 43- to 78-positions has homology to WW domains. 
Tryptophan residues at the 49- and 72-positions and a proline residue at the 
75-position are amino acid residues conserved in every known WW domain. 
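
Checking conserved residues of this kind amounts to converting absolute positions in the protein into offsets within the aligned segment. The sketch below shows that bookkeeping for a segment starting at position 43. The helper names are hypothetical, and the test segment is synthetic filler built only to exercise the check. It is not the HP03494 sequence.

```python
SEGMENT_START = 43                      # first residue of the WW-homologous region
SEGMENT_END = 78                        # last residue of the region
CONSERVED = {49: "W", 72: "W", 75: "P"}  # position -> expected residue


def offsets(segment_start, positions):
    """Convert 1-based protein positions into 0-based offsets within the segment."""
    return [pos - segment_start for pos in positions]


def check_conserved(segment, segment_start, expected):
    """Return {position: True/False} for each expected conserved residue."""
    return {pos: segment[pos - segment_start] == aa for pos, aa in expected.items()}


# Synthetic placeholder segment: alanine filler with the three residues set.
length = SEGMENT_END - SEGMENT_START + 1
synthetic = ["A"] * length
for pos, aa in CONSERVED.items():
    synthetic[pos - SEGMENT_START] = aa
synthetic = "".join(synthetic)

print(length)                                        # 36
print(offsets(SEGMENT_START, sorted(CONSERVED)))     # [6, 29, 32]
print(check_conserved(synthetic, SEGMENT_START, CONSERVED))
```

The same check can be applied to a real segment once a verified sequence is available.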
Table 1

Protein             Position  Amino Acid Sequence                             Accession No.
Conserved Sequence            W G — YY-N W— P-
HP03494             43        ELVHAG^C^SRRENRPYYFNRFTNQSLWEMPVLQOHD
Npw38               46        EGLPPSVm<M^P$CGLPYYW^mVSWLSPHDPNSV              BAA76400
Yap_Human           171       VPL p AGWEMAKTSS. GQRYFLNH 1 DQTTTTODPRKANLS    P46937
Yap_Chick-1         169       VPLPPGWEMAKTPS. GQRYFLNH I CX3TTTWQ0PRKMLS      P46936
Yap_Mouse-1         156       VPLPAGWEMAKTSS . GQRYFLNHNDGTTTWGDPRKAALS       P46938
Ned4_Mouse-1        40        SPLPPGWEERQDVL. GRTYYVNHESRRTQWKRPSPDDOL        P46935
Ned4_Human-1        213       SPLPPGWEERQD1L GRTYYVNHESRRTQWKRPTPQDNL         P46934
Ned4_Mouse-2        196       SGLPPGWEEKQDOR. GRSYYVDHNSKTTTWSKPTMQDOP        P46935
Ned4_Human-2        375       SGLPPGWEEKQDER. GRSYYMDHNSRTTTWTKPTVQATV        P46934
Dmd_Human           3055      TSVQGPWERAISPN. KVPYY f NHETQTTCWDHPKMTELY      P11532
Dmd_Mouse           3048      TSVQGPWERAISPN, KVPYY 1 NHEJOTTCWDHPKMTELY      P11531
FE65_Rat            42        SDLPAGWM3VQDTS. GTTYWHI . PTGTTQWEPPGRASPS      P46933
Msb1_Human          249       ! VLPPNWKTARDPE GK i YYYHV 1 TRGTQWDPP7WESPG
IQGA_Human          679       GDNNSKWVKHWVKG. GYYYYHNLETQEGGWDEPPNFVQN        P46940
FBP11-1_Mouse       1         WTEHKSPO. GRTYYWETKQSTlrVEKPDDLKTP              U40747
FBP11-2_Mouse       36        LLSKCPWKTYKSDS. GKPYYYNSGTKESRWAKP              U40747




(ii) Northern blotting 

Multi tissue Northern Blot (Clontech) having human tissue poly(A)+ RNA
blotted thereon was used as an mRNA source. As the probe, an EcoRI-NotI
fragment of full-length HP03494 cDNA, labeled with a radioisotope by a random
primer labeling kit (Pharmacia), was used. The conditions for Northern blotting
hybridization followed the protocol attached to the kit. A hybridization band
of about 3 kb was obtained from the heart, brain, placenta, lung, liver,
skeletal muscle, kidney, pancreas, spleen, thymus, prostate, testicle, ovary,
small intestine, colon and peripheral blood, suggesting that this protein is a
housekeeping protein.



(iii) Protein synthesis by in vitro translation 

A plasmid vector harboring the polynucleotide (cDNA) of this invention
was used to perform in vitro transcription/translation by a TnT rabbit
reticulocyte lysate kit (a product of Promega). The expression product was
labeled with a radioisotope by adding [35S]methionine. Every reaction was
conducted according to the protocol attached to the kit. 2 µg of the plasmid
was reacted at 30 °C for 90 minutes in a 25 µl reaction solution containing 12.5
µl TnT rabbit reticulocyte lysate, 0.5 µl buffer (attached to the kit), 2 µl amino
acid mixture (not containing methionine), 2 µl (0.35 MBq/µl) of [35S]methionine
(Amersham), 0.5 µl of T7 RNA polymerase and 20 U of RNasin. Then, 2 µl SDS
sampling buffer (125 mM Tris-HCl, pH 6.8, 120 mM 2-mercaptoethanol, 2 %
SDS solution, 0.025 % bromophenol blue, 20 % glycerol) was added to 3 µl of
the reaction solution, and the mixture was heated at 95 °C for 3
minutes and subjected to SDS-polyacrylamide gel electrophoresis. By
autoradiography, the molecular weight of the translated product was
determined. As a result, a translation product was formed that had a molecular
weight of about 80 kDa, close to the molecular weight (80,618) deduced from
the ORF.




(iv) Expression of GST fusion protein in E. coli 

The translated region was amplified by PCR where pHP03494 was used
as a template while a 26-mer sense primer (SEQ ID NO: 4) starting at a
translation initiation codon and having an EcoRI recognition site added thereto
and a 26-mer antisense primer (SEQ ID NO: 5) terminating at a termination
codon having a SalI recognition site added thereto were used respectively as
primers. The PCR product was digested with restriction enzyme EcoRI and
inserted into the EcoRI site in vector pGEX-5X-1 (Pharmacia). After its nucleotide
sequence was confirmed, the resulting plasmid was used for transforming E. coli
BL21. The transformant was cultured at 37 °C for 5 hours in LB medium, and
IPTG was added thereto at a final concentration of 0.4 mM, followed by
culturing at 37 °C for 2.5 hours. The microorganism was separated by
centrifugation and lysed in a lysing solution (50 mM Tris-HCl (pH 7.5), 1 mM
EDTA, 1 % Triton X-100, 0.2 % SDS, 0.2 mM PMSF), frozen once at -80 °C,
thawed, and disrupted by sonication. After centrifugation at 1000 x g for 30
minutes, glutathione Sepharose 4B was added to the supernatant and
incubated at 4 °C for 1 hour. After the beads were sufficiently washed, a
fusion protein was eluted with an eluent (10 mM Tris-50 mM glutathione). As
a result, a GST-HP03494 fusion protein having a molecular weight of about 110
kDa was obtained.

(v) Preparation of antibody 

Domestic rabbits were immunized with the above fusion protein as the 
antigen to give antiserum. First, an antiserum fraction precipitating by 40 % 
saturation with ammonium sulfate was applied onto a GST affinity column to 
remove anti-GST antibody. Then, the unadsorbed fraction was purified by a
GST-HP03494-antigen column. 

(vi) Western blotting 

A lysate of human fibrosarcoma cell line HT-1080 was separated by 
SDS-PAGE, blotted onto a PVDF membrane, blocked for 1 hour at room 
temperature with 0.05 % Tween 20-PBS (TPBS) containing 5 % skim milk, and 
incubated with the antibody diluted 10,000-fold with TPBS. The sample was 
washed 3 times with TPBS and then incubated for 1 hour with horseradish 
peroxidase-labeled goat anti-rabbit IgG diluted 10,000-fold with TPBS. The 
sample was washed four times with TPBS and detected by luminescence with 
an ECL reagent (Amersham), to give a signal with a molecular weight of 80 kDa. 
This molecular weight agreed with the molecular weight of the in vitro translated
protein product in the rabbit cell-free translation system. 

(vii) Expression of GFP fusion protein 

The translated region was amplified by PCR where pHP03494 was used
as a template while a 26-mer sense primer (SEQ ID NO: 4) starting at a
translation initiation codon having an EcoRI recognition site added thereto and
a 26-mer antisense primer (SEQ ID NO: 5) terminating at a termination codon
having a SalI recognition site added thereto were used respectively as primers.
The PCR product was digested with restriction enzymes EcoRI and SalI and
inserted into the EcoRI site in GFP fusion protein expression vector pEGFP-C2
(Clontech). After the nucleotide sequence was confirmed, HeLa cells were
transfected by the lipofection method with the resulting plasmid
pEGFP-C2-HP03494. Under a fluorescence microscope, the cells transfected
with pEGFP-C2 showed fluorescence on the whole of the cells, whereas the cells
transfected with pEGFP-C2-HP03494 showed fluorescence on their nuclei only.
This result indicated that HP03494 is a protein present in the nucleus.



(viii) Binding to a C-terminal domain (CTD) of RNA polymerase II

The translated region coding for the WW domain was amplified by PCR
where pHP03494 was used as a template while a 33-mer sense primer (SEQ ID
NO: 6) starting at a translation initiation codon with a BamHI recognition site
added thereto and a 33-mer antisense primer (SEQ ID NO: 7) terminating at a
termination codon with an EcoRI recognition site added thereto were used
respectively as primers. The PCR product was digested with restriction
enzymes BamHI and EcoRI and then inserted into BamHI-EcoRI sites in vector
pGEX-5X-1 (Pharmacia). The resulting plasmid was subjected to expression in
E. coli in the same manner as in (iv), to give a fusion protein GST-HP03494WW
consisting of GST and the HP03494 WW domain, and this fusion protein was
separated by SDS-PAGE, then transferred onto a PVDF membrane, incubated
with 32P-labeled GST-CTD or 32P-labeled GST-pCTD (GST-phosphorylated CTD)
phosphorylated depending on a nuclear extract (Hirose, Y. and Manley, J. L.,
Nature, 395, 93-96, 1998), and detected by the Far Western method (Kaelin, Jr.
et al., Cell, 70, 351-364, 1992). It was revealed that the WW domain on
HP03494 binds more strongly to phosphorylated CTD. This result suggested
that the protein of this invention is involved in regulating transcription.

Industrial Applicability 

This invention provides an isolated and purified human nuclear protein 
existing in human cell nucleus, a polynucleotide (human cDNA and genomic 
DNA fragment) encoding this protein, and an antibody against this nuclear 
protein. The protein and antibody of this invention are useful for diagnosis 
and therapy of morbid states such as cancers. By use of the present 
polynucleotide, the present protein can be expressed in a large amount. By 
screening a low-molecular compound binding to the present protein, a new type 
of pharmaceutical preparation such as antitumor agent can be searched for. 



CLAIMS 



1. An isolated and purified human nuclear protein comprising the amino 
acid sequence of SEQ ID NO: 1. 

2. A polynucleotide encoding the protein of claim 1, which comprises the 
nucleotide sequence of SEQ ID NO: 2. 

3. The polynucleotide of claim 2, consisting of the nucleotide sequence of 
SEQ ID NO: 2. 

4. A human genomic DNA fragment with which a polynucleotide of SEQ ID 
NO:3 or a partial contiguous sequence thereof hybridizes under stringent 
conditions. 

5. An expression vector expressing the polynucleotide of claim 2 or 3 in in 
vitro translation or in host cells. 

6. A transformed cell producing the human nuclear protein of claim 1, 
which is transformed with the expression vector of claim 5. 

7. An antibody against the human nuclear protein of claim 1. 

09/889722 

Rec'd PCT/PTO 20 JUL 2001 

SEQUENCE LISTING 

<110> Japan Science and Technology Corporation 

<120> Human nucleoprotein having a WW domain and 
a polynucleotide encoding the protein 

<130> 00-F-061PCT 

<140> PCT/JP00/08253 
<141> 2000-11-22 

<150> JP11-332572 
<151> 1999-11-24 

<160> 7 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 704 
<212> PRT 
<213> Homo sapiens 

<400> 1 

Met Ala Asn Glu Asn His Gly Ser Pro Arg Glu Glu Ala Ser Leu Leu 
1 5 10 15 

Ser His Ser Pro Gly Thr Ser Asn Gln Ser Gln Pro Cys Ser Pro Lys 
20 25 30 

Pro Ile Arg Leu Val Gln Asp Leu Pro Glu Glu Leu Val His Ala Gly 
35 40 45 

Trp Glu Lys Cys Trp Ser Arg Arg Glu Asn Arg Pro Tyr Tyr Phe Asn 
50 55 60 

Arg Phe Thr Asn Gln Ser Leu Trp Glu Met Pro Val Leu Gly Gln His 
65 70 75 80 

Asp Val Ile Ser Asp Pro Leu Gly Leu Asn Ala Thr Pro Leu Pro Gln 
85 90 95 

Asp Ser Ser Leu Val Glu Thr Pro Pro Ala Glu Asn Lys Pro Arg Lys 
100 105 110 

Arg Gln Leu Ser Glu Glu Gln Pro Ser Gly Asn Gly Val Lys Lys Pro 
115 120 125 

Lys Ile Glu Ile Pro Val Thr Pro Thr Gly Gln Ser Val Pro Ser Ser 
130 135 140 

Pro Ser Ile Pro Gly Thr Pro Thr Leu Lys Met Trp Gly Thr Ser Pro 
145 150 155 160 

Glu Asp Lys Gln Gln Ala Ala Leu Leu Arg Pro Thr Glu Val Tyr Trp 
165 170 175 

Asp Leu Asp Ile Gln Thr Asn Ala Val Ile Lys His Arg Gly Pro Ser 
180 185 190 

Glu Val Leu Pro Pro His Pro Glu Val Glu Leu Leu Arg Ser Gln Leu 
195 200 205 

Ile Leu Lys Leu Arg Gln His Tyr Arg Glu Leu Cys Gln Gln Arg Glu 
210 215 220 

Gly Ile Glu Pro Pro Arg Glu Ser Phe Asn Arg Trp Met Leu Glu Arg 
225 230 235 240 

Lys Val Val Asp Lys Gly Ser Asp Pro Leu Leu Pro Ser Asn Cys Glu 
245 250 255 

Pro Val Val Ser Pro Ser Met Phe Arg Glu Ile Met Asn Asp Ile Pro 
260 265 270 

Ile Arg Leu Ser Arg Ile Lys Phe Arg Glu Glu Ala Lys Arg Leu Leu 
275 280 285 

Phe Lys Tyr Ala Glu Ala Ala Arg Arg Leu Ile Glu Ser Arg Ser Ala 
290 295 300 

Ser Pro Asp Ser Arg Lys Val Val Lys Trp Asn Val Glu Asp Thr Phe 
305 310 315 320 

Ser Trp Leu Arg Lys Asp His Ser Ala Ser Lys Glu Asp Tyr Met Asp 
325 330 335 

Arg Leu Glu His Leu Arg Arg Gln Cys Gly Pro His Val Ser Ala Ala 
340 345 350 

Ala Lys Asp Ser Val Glu Gly Ile Cys Ser Lys Ile Tyr His Ile Ser 
355 360 365 

Leu Glu Tyr Val Lys Arg Ile Arg Glu Lys His Leu Ala Ile Leu Lys 
370 375 380 

Glu Asn Asn Ile Ser Glu Glu Val Glu Ala Pro Glu Val Glu Pro Arg 
385 390 395 400 

Leu Val Tyr Cys Tyr Pro Val Arg Leu Ala Val Ser Ala Pro Pro Met 
405 410 415 

Pro Ser Val Glu Met His Met Glu Asn Asn Val Val Cys Ile Arg Tyr 
420 425 430 

Lys Gly Glu Met Val Lys Val Ser Arg Asn Tyr Phe Ser Lys Leu Trp 
435 440 445 

Leu Leu Tyr Arg Tyr Ser Cys Ile Asp Asp Ser Ala Phe Glu Arg Phe 
450 455 460 

Leu Pro Arg Val Trp Cys Leu Leu Arg Arg Tyr Gln Met Met Phe Gly 
465 470 475 480 

Val Gly Leu Tyr Glu Gly Thr Gly Leu Gln Gly Ser Leu Pro Val His 
485 490 495 

Val Phe Glu Ala Leu His Arg Leu Phe Gly Val Ser Phe Glu Cys Phe 
500 505 510 

Ala Ser Pro Leu Asn Cys Tyr Phe Arg Gln Tyr Cys Ser Ala Phe Pro 
515 520 525 

Asp Thr Asp Gly Tyr Phe Gly Ser Arg Gly Pro Cys Leu Asp Phe Ala 
530 535 540 

Pro Leu Ser Gly Ser Phe Glu Ala Asn Pro Pro Phe Cys Glu Glu Leu 
545 550 555 560 

Met Asp Ala Met Val Ser His Phe Glu Arg Leu Leu Glu Ser Ser Pro 
565 570 575 

Glu Pro Leu Ser Phe Ile Val Phe Ile Pro Glu Trp Arg Glu Pro Pro 
580 585 590 

Thr Pro Ala Leu Thr Arg Met Glu Gln Ser Arg Phe Lys Arg His Gln 
595 600 605 

Leu Ile Leu Pro Ala Phe Glu His Glu Tyr Arg Ser Gly Ser Gln His 
610 615 620 

Ile Cys Lys Lys Glu Glu Met His Tyr Lys Ala Val His Asn Thr Ala 
625 630 635 640 

Val Leu Phe Leu Gln Asn Asp Pro Gly Phe Ala Lys Trp Ala Pro Thr 
645 650 655 

Pro Glu Arg Leu Gln Glu Leu Ser Ala Ala Tyr Arg Gln Ser Gly Arg 
660 665 670 

Ser His Ser Ser Gly Ser Ser Ser Ser Ser Ser Ser Glu Ala Lys Asp 
675 680 685 

Arg Asp Ser Gly Arg Glu Gln Gly Pro Ser Arg Glu Pro His Pro Thr 
690 695 700 



<210> 2 
<211> 2112 
<212> DNA 

<213> Homo sapiens 

<400> 2 

atggccaatg agaatcacgg cagcccccgg gaggaagcgt ccctgctgag tcactcccca 60 
ggtacctcca atcagagcca gccctgttct ccaaagccaa tccgcctggt tcaggacctc 120 
ccagaggagc tggtgcatgc aggctgggag aagtgctgga gccggaggga gaatcgtccc 180 
tactacttca accgattcac caaccagtcc ctgtgggaga tgcccgtgct ggggcagcac 240 
gatgtgattt cggacccttt ggggctgaat gcgaccccac tgccccaaga ctcaagcttg 300 
gtggaaactc ccccggctga gaacaagccc agaaagcggc agctctcgga agagcagcca 360 
agcggcaatg gtgtgaagaa gcccaagatt gaaatcccag tgacacccac aggccagtcg 420 
gtgcccagct cccccagtat cccaggaacc ccaacgctga agatgtgggg tacgtcccct 480 
gaagataaac agcaggcagc tctcctacga cccactgagg tctactggga cctggacatc 540 
cagaccaatg ctgtcatcaa gcaccggggg ccttcagagg tgctgccccc gcatcccgaa 600 
gtggaactgc tccgctctca gctcatcctg aagcttcggc agcactatcg ggagctgtgc 660 
cagcagcgag agggcattga gcctccacgg gagtctttca accgctggat gctggagcgc 720 
aaggtggtag acaaaggatc tgaccccctg ttgcccagca actgtgaacc agtcgtgtca 780 
ccttccatgt ttcgtgaaat catgaacgac attcctatca ggttatcccg aatcaagttc 840 
cgggaggaag ccaagcgcct gctctttaaa tatgcggagg ccgccaggcg gctcatcgag 900 
tccaggagtg catcccctga cagtaggaag gtggtcaaat ggaatgtgga agacaccttt 960 
agctggcttc ggaaggacca ctcagcctcc aaggaggact acatggatcg cctggagcat 1020 
ctgcggaggc agtgtggccc ccacgtctcg gccgcagcca aggactccgt ggaaggcatc 1080 
tgcagtaaga tctaccacat ctccctggag tacgtcaaac ggatccgaga gaagcacctt 1140 
gccatcctca aggaaaacaa catctcagag gaggtggagg cccctgaggt ggagccccgc 1200 
ctagtgtact gctacccagt ccggctggct gtgtctgcac cgcccatgcc cagcgtggag 1260 
atgcacatgg agaacaacgt ggtctgcatc cggtataagg gagagatggt caaggtcagc 1320 
cgcaactact tcagcaagct gtggctcctt taccgctaca gctgcattga tgactctgcc 1380 
tttgagaggt tcctgccccg ggtctggtgt cttctccgac ggtaccagat gatgttcggc 1440 
gtgggcctct acgaggggac tggcctgcag ggatcgctgc ctgtgcatgt ctttgaggcc 1500 
ctccaccgac tctttggcgt cagcttcgag tgcttcgcct cacccctcaa ctgctacttc 1560 
cgccagtact gttctgcctt ccccgacaca gacggctact ttggctcccg cgggccctgc 1620 
ctagactttg ctccactgag tggttcattt gaggccaacc ctcccttctg cgaggagctc 1680 
atggatgcca tggtctctca ctttgagaga ctgcttgaga gctcaccgga gcccctgtcc 1740 
ttcatcgtgt tcatccctga gtggcgggaa cccccaacac cagcgctcac ccgcatggag 1800 
cagagccgct tcaaacgcca ccagttgatc ctgcctgcct ttgagcatga gtaccgcagt 1860 
ggctcccagc acatctgcaa gaaggaggaa atgcactaca aggccgtcca caacacggct 1920 
gtgctcttcc tacagaacga ccctggcttt gccaagtggg cgccgacgcc tgaacggctg 1980 
caggagctga gtgctgccta ccggcagtca ggccgcagcc acagctctgg ttcttcctca 2040 
tcgtcctcct cggaggccaa ggaccgggac tcgggccgtg agcagggtcc tagccgcgag 2100 
cctcacccca ct 2112 



<210> 3 
<211> 2669 
<212> DNA 
<213> Homo sapiens 

<220> 
<221> CDS 
<222> (292)..(2406) 

<400> 3 

acacaagatg gcggcagcgg cgctggggag ggcgaggcgg aggcggcaaa acgggcggtc 60 
gagcagaacg tgtagccgcg tcccctccag tccgctccgg gcagctgctg atgcaaggaa 120 
tcccctgggc tcccgtccac tccactgctg accagcccat tcgcctgtgc tgagtcttcc 180 
tgcaggcctt tccttgcctc tgtgggaccc tgtgggggtc catccggctg gagaagaaaa 240 
gcctctcatg ctaacgttgc agaccccaga gggtcctgtg tgggtgtgga g atg gcc 297 
Met Ala 
1 

aat gag aat cac ggc agc ccc cgg gag gaa gcg tcc ctg ctg agt cac 345 
Asn Glu Asn His Gly Ser Pro Arg Glu Glu Ala Ser Leu Leu Ser His 
5 10 15 

tcc cca ggt acc tcc aat cag agc cag ccc tgt tct cca aag cca atc 393 
Ser Pro Gly Thr Ser Asn Gln Ser Gln Pro Cys Ser Pro Lys Pro Ile 
20 25 30 

cgc ctg gtt cag gac ctc cca gag gag ctg gtg cat gca ggc tgg gag 441 
Arg Leu Val Gln Asp Leu Pro Glu Glu Leu Val His Ala Gly Trp Glu 
35 40 45 50 

aag tgc tgg agc cgg agg gag aat cgt ccc tac tac ttc aac cga ttc 489 
Lys Cys Trp Ser Arg Arg Glu Asn Arg Pro Tyr Tyr Phe Asn Arg Phe 
55 60 65 

acc aac cag tcc ctg tgg gag atg ccc gtg ctg ggg cag cac gat gtg 537 
Thr Asn Gln Ser Leu Trp Glu Met Pro Val Leu Gly Gln His Asp Val 
70 75 80 

att tcg gac cct ttg ggg ctg aat gcg acc cca ctg ccc caa gac tca 585 
Ile Ser Asp Pro Leu Gly Leu Asn Ala Thr Pro Leu Pro Gln Asp Ser 
85 90 95 

agc ttg gtg gaa act ccc ccg gct gag aac aag ccc aga aag cgg cag 633 
Ser Leu Val Glu Thr Pro Pro Ala Glu Asn Lys Pro Arg Lys Arg Gln 
100 105 110 

ctc tcg gaa gag cag cca agc ggc aat ggt gtg aag aag ccc aag att 681 
Leu Ser Glu Glu Gln Pro Ser Gly Asn Gly Val Lys Lys Pro Lys Ile 
115 120 125 130 

gaa atc cca gtg aca ccc aca ggc cag tcg gtg ccc agc tcc ccc agt 729 
Glu Ile Pro Val Thr Pro Thr Gly Gln Ser Val Pro Ser Ser Pro Ser 
135 140 145 

atc cca gga acc cca acg ctg aag atg tgg ggt acg tcc cct gaa gat 777 
Ile Pro Gly Thr Pro Thr Leu Lys Met Trp Gly Thr Ser Pro Glu Asp 
150 155 160 

aaa cag cag gca gct ctc cta cga ccc act gag gtc tac tgg gac ctg 825 
Lys Gln Gln Ala Ala Leu Leu Arg Pro Thr Glu Val Tyr Trp Asp Leu 
165 170 175 

gac atc cag acc aat gct gtc atc aag cac cgg ggg cct tca gag gtg 873 
Asp Ile Gln Thr Asn Ala Val Ile Lys His Arg Gly Pro Ser Glu Val 
180 185 190 

ctg ccc ccg cat ccc gaa gtg gaa ctg ctc cgc tct cag ctc atc ctg 921 
Leu Pro Pro His Pro Glu Val Glu Leu Leu Arg Ser Gln Leu Ile Leu 
195 200 205 210 

aag ctt cgg cag cac tat cgg gag ctg tgc cag cag cga gag ggc att 969 
Lys Leu Arg Gln His Tyr Arg Glu Leu Cys Gln Gln Arg Glu Gly Ile 
215 220 225 

gag cct cca cgg gag tct ttc aac cgc tgg atg ctg gag cgc aag gtg 1017 
Glu Pro Pro Arg Glu Ser Phe Asn Arg Trp Met Leu Glu Arg Lys Val 
230 235 240 

gta gac aaa gga tct gac ccc ctg ttg ccc agc aac tgt gaa cca gtc 1065 
Val Asp Lys Gly Ser Asp Pro Leu Leu Pro Ser Asn Cys Glu Pro Val 
245 250 255 

gtg tca cct tcc atg ttt cgt gaa atc atg aac gac att cct atc agg 1113 
Val Ser Pro Ser Met Phe Arg Glu Ile Met Asn Asp Ile Pro Ile Arg 
260 265 270 

tta tcc cga atc aag ttc cgg gag gaa gcc aag cgc ctg ctc ttt aaa 1161 
Leu Ser Arg Ile Lys Phe Arg Glu Glu Ala Lys Arg Leu Leu Phe Lys 
275 280 285 290 

tat gcg gag gcc gcc agg cgg ctc atc gag tcc agg agt gca tcc cct 1209 
Tyr Ala Glu Ala Ala Arg Arg Leu Ile Glu Ser Arg Ser Ala Ser Pro 
295 300 305 

gac agt agg aag gtg gtc aaa tgg aat gtg gaa gac acc ttt agc tgg 1257 
Asp Ser Arg Lys Val Val Lys Trp Asn Val Glu Asp Thr Phe Ser Trp 
310 315 320 

ctt cgg aag gac cac tca gcc tcc aag gag gac tac atg gat cgc ctg 1305 
Leu Arg Lys Asp His Ser Ala Ser Lys Glu Asp Tyr Met Asp Arg Leu 
325 330 335 

gag cat ctg cgg agg cag tgt ggc ccc cac gtc tcg gcc gca gcc aag 1353 
Glu His Leu Arg Arg Gln Cys Gly Pro His Val Ser Ala Ala Ala Lys 
340 345 350 

gac tcc gtg gaa ggc atc tgc agt aag atc tac cac atc tcc ctg gag 1401 
Asp Ser Val Glu Gly Ile Cys Ser Lys Ile Tyr His Ile Ser Leu Glu 
355 360 365 370 

tac gtc aaa cgg atc cga gag aag cac ctt gcc atc ctc aag gaa aac 1449 
Tyr Val Lys Arg Ile Arg Glu Lys His Leu Ala Ile Leu Lys Glu Asn 
375 380 385 

aac atc tca gag gag gtg gag gcc cct gag gtg gag ccc cgc cta gtg 1497 
Asn Ile Ser Glu Glu Val Glu Ala Pro Glu Val Glu Pro Arg Leu Val 
390 395 400 

tac tgc tac cca gtc cgg ctg gct gtg tct gca ccg ccc atg ccc agc 1545 
Tyr Cys Tyr Pro Val Arg Leu Ala Val Ser Ala Pro Pro Met Pro Ser 
405 410 415 

gtg gag atg cac atg gag aac aac gtg gtc tgc atc cgg tat aag gga 1593 
Val Glu Met His Met Glu Asn Asn Val Val Cys Ile Arg Tyr Lys Gly 
420 425 430 

gag atg gtc aag gtc agc cgc aac tac ttc agc aag ctg tgg ctc ctt 1641 
Glu Met Val Lys Val Ser Arg Asn Tyr Phe Ser Lys Leu Trp Leu Leu 
435 440 445 450 

tac cgc tac agc tgc att gat gac tct gcc ttt gag agg ttc ctg ccc 1689 
Tyr Arg Tyr Ser Cys Ile Asp Asp Ser Ala Phe Glu Arg Phe Leu Pro 
455 460 465 

cgg gtc tgg tgt ctt ctc cga cgg tac cag atg atg ttc ggc gtg ggc 1737 
Arg Val Trp Cys Leu Leu Arg Arg Tyr Gln Met Met Phe Gly Val Gly 
470 475 480 

ctc tac gag ggg act ggc ctg cag gga tcg ctg cct gtg cat gtc ttt 1785 
Leu Tyr Glu Gly Thr Gly Leu Gln Gly Ser Leu Pro Val His Val Phe 
485 490 495 

gag gcc ctc cac cga ctc ttt ggc gtc agc ttc gag tgc ttc gcc tca 1833 
Glu Ala Leu His Arg Leu Phe Gly Val Ser Phe Glu Cys Phe Ala Ser 
500 505 510 

ccc ctc aac tgc tac ttc cgc cag tac tgt tct gcc ttc ccc gac aca 1881 
Pro Leu Asn Cys Tyr Phe Arg Gln Tyr Cys Ser Ala Phe Pro Asp Thr 
515 520 525 530 

gac ggc tac ttt ggc tcc cgc ggg ccc tgc cta gac ttt gct cca ctg 1929 
Asp Gly Tyr Phe Gly Ser Arg Gly Pro Cys Leu Asp Phe Ala Pro Leu 
535 540 545 

agt ggt tca ttt gag gcc aac cct ccc ttc tgc gag gag ctc atg gat 1977 
Ser Gly Ser Phe Glu Ala Asn Pro Pro Phe Cys Glu Glu Leu Met Asp 
550 555 560 

gcc atg gtc tct cac ttt gag aga ctg ctt gag agc tca ccg gag ccc 2025 
Ala Met Val Ser His Phe Glu Arg Leu Leu Glu Ser Ser Pro Glu Pro 
565 570 575 

ctg tcc ttc atc gtg ttc atc cct gag tgg cgg gaa ccc cca aca cca 2073 
Leu Ser Phe Ile Val Phe Ile Pro Glu Trp Arg Glu Pro Pro Thr Pro 
580 585 590 

gcg ctc acc cgc atg gag cag agc cgc ttc aaa cgc cac cag ttg atc 2121 
Ala Leu Thr Arg Met Glu Gln Ser Arg Phe Lys Arg His Gln Leu Ile 
595 600 605 610 

ctg cct gcc ttt gag cat gag tac cgc agt ggc tcc cag cac atc tgc 2169 
Leu Pro Ala Phe Glu His Glu Tyr Arg Ser Gly Ser Gln His Ile Cys 
615 620 625 

aag aag gag gaa atg cac tac aag gcc gtc cac aac acg gct gtg ctc 2217 
Lys Lys Glu Glu Met His Tyr Lys Ala Val His Asn Thr Ala Val Leu 
630 635 640 

ttc cta cag aac gac cct ggc ttt gcc aag tgg gcg ccg acg cct gaa 2265 
Phe Leu Gln Asn Asp Pro Gly Phe Ala Lys Trp Ala Pro Thr Pro Glu 
645 650 655 

cgg ctg cag gag ctg agt gct gcc tac cgg cag tca ggc cgc agc cac 2313 
Arg Leu Gln Glu Leu Ser Ala Ala Tyr Arg Gln Ser Gly Arg Ser His 
660 665 670 

agc tct ggt tct tcc tca tcg tcc tcc tcg gag gcc aag gac cgg gac 2361 
Ser Ser Gly Ser Ser Ser Ser Ser Ser Ser Glu Ala Lys Asp Arg Asp 
675 680 685 690 

tcg ggc cgt gag cag ggt cct agc cgc gag cct cac ccc act taa 2406 
Ser Gly Arg Glu Gln Gly Pro Ser Arg Glu Pro His Pro Thr 
695 700 705 

catatcctgc ggggaggagg agccccaggg gtgctagtct ggactgctgg gactcgggcc 2466 
cctggggcct cagagggacc ccggctgcca ctgacatatg aagattatgg ttctgccagg 2526 
gctcccctcc ctgcctgtcc ccaagtcctc acctcaaact ccctccaagt cccatgtata 2586 
taggtcctga tgccttccca accccgcccc tcaccctgtt gccaccttgt ttcatttgta 2646 
aaaggaaata cagaaacccc ccc 2669 

<210> 4 
<211> 26 
<212> DNA 
<213> Artificial sequence 

<220> 
<223> Synthesized oligonucleotide 

<400> 4 

ccgaattcat ggccaatgag aatcac 26 

<210> 5 
<211> 26 
<212> DNA 
<213> Artificial sequence 

<220> 
<223> Synthesized oligonucleotide 

<400> 5 

ccgtcgactt aagtggggtg aggctc 26 



<210> 6 
<211> 33 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthesized oligonucleotide 
<400> 6 

cgaggatccg ttcaggacct cccagaggacg eta 



<210> 7 
<211> 33 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthesized oligonucleotide 
<400> 7 

cgagaattcc gaaatcacat cgtgctgccc cag 



SEQUENCE LISTING 

<110> Japan Science and Technology Corporation 

<120> Human nucleoprotein having a WW domain and 
a polynucleotide encoding the protein 

<130> 09/889,722 

<140> PCT/JP00/08253 
<141> 2000-11-22 

<150> JP11-332572 
<151> 1999-11-24 

<160> 7 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 704 
<212> PRT 

<213> Homo sapiens 





<400> 1 

Met Ala Asn Glu Asn His Gly Ser Pro Arg Glu Glu Ala Ser Leu Leu 
1 5 10 15 

Ser His Ser Pro Gly Thr Ser Asn Gln Ser Gln Pro Cys Ser Pro Lys 
20 25 30 

Pro Ile Arg Leu Val Gln Asp Leu Pro Glu Glu Leu Val His Ala Gly 
35 40 45 

Trp Glu Lys Cys Trp Ser Arg Arg Glu Asn Arg Pro Tyr Tyr Phe Asn 
50 55 60 

Arg Phe Thr Asn Gln Ser Leu Trp Glu Met Pro Val Leu Gly Gln His 
65 70 75 80 

Asp Val Ile Ser Asp Pro Leu Gly Leu Asn Ala Thr Pro Leu Pro Gln 
85 90 95 

Asp Ser Ser Leu Val Glu Thr Pro Pro Ala Glu Asn Lys Pro Arg Lys 
100 105 110 

Arg Gln Leu Ser Glu Glu Gln Pro Ser Gly Asn Gly Val Lys Lys Pro 
115 120 125 

Lys Ile Glu Ile Pro Val Thr Pro Thr Gly Gln Ser Val Pro Ser Ser 
130 135 140 

Pro Ser Ile Pro Gly Thr Pro Thr Leu Lys Met Trp Gly Thr Ser Pro 
145 150 155 160 

Glu Asp Lys Gln Gln Ala Ala Leu Leu Arg Pro Thr Glu Val Tyr Trp 
165 170 175 

Asp Leu Asp Ile Gln Thr Asn Ala Val Ile Lys His Arg Gly Pro Ser 
180 185 190 

Glu Val Leu Pro Pro His Pro Glu Val Glu Leu Leu Arg Ser Gln Leu 
195 200 205 

Ile Leu Lys Leu Arg Gln His Tyr Arg Glu Leu Cys Gln Gln Arg Glu 
210 215 220 

Gly Ile Glu Pro Pro Arg Glu Ser Phe Asn Arg Trp Met Leu Glu Arg 
225 230 235 240 

Lys Val Val Asp Lys Gly Ser Asp Pro Leu Leu Pro Ser Asn Cys Glu 
245 250 255 

Pro Val Val Ser Pro Ser Met Phe Arg Glu Ile Met Asn Asp Ile Pro 
260 265 270 

Ile Arg Leu Ser Arg Ile Lys Phe Arg Glu Glu Ala Lys Arg Leu Leu 
275 280 285 

Phe Lys Tyr Ala Glu Ala Ala Arg Arg Leu Ile Glu Ser Arg Ser Ala 
290 295 300 

Ser Pro Asp Ser Arg Lys Val Val Lys Trp Asn Val Glu Asp Thr Phe 
305 310 315 320 

Ser Trp Leu Arg Lys Asp His Ser Ala Ser Lys Glu Asp Tyr Met Asp 
325 330 335 

Arg Leu Glu His Leu Arg Arg Gln Cys Gly Pro His Val Ser Ala Ala 
340 345 350 

Ala Lys Asp Ser Val Glu Gly Ile Cys Ser Lys Ile Tyr His Ile Ser 
355 360 365 

Leu Glu Tyr Val Lys Arg Ile Arg Glu Lys His Leu Ala Ile Leu Lys 
370 375 380 

Glu Asn Asn Ile Ser Glu Glu Val Glu Ala Pro Glu Val Glu Pro Arg 
385 390 395 400 

Leu Val Tyr Cys Tyr Pro Val Arg Leu Ala Val Ser Ala Pro Pro Met 
405 410 415 

Pro Ser Val Glu Met His Met Glu Asn Asn Val Val Cys Ile Arg Tyr 
420 425 430 

Lys Gly Glu Met Val Lys Val Ser Arg Asn Tyr Phe Ser Lys Leu Trp 
435 440 445 

Leu Leu Tyr Arg Tyr Ser Cys Ile Asp Asp Ser Ala Phe Glu Arg Phe 
450 455 460 

Leu Pro Arg Val Trp Cys Leu Leu Arg Arg Tyr Gln Met Met Phe Gly 
465 470 475 480 

Val Gly Leu Tyr Glu Gly Thr Gly Leu Gln Gly Ser Leu Pro Val His 
485 490 495 

Val Phe Glu Ala Leu His Arg Leu Phe Gly Val Ser Phe Glu Cys Phe 
500 505 510 

Ala Ser Pro Leu Asn Cys Tyr Phe Arg Gln Tyr Cys Ser Ala Phe Pro 
515 520 525 

Asp Thr Asp Gly Tyr Phe Gly Ser Arg Gly Pro Cys Leu Asp Phe Ala 
530 535 540 

Pro Leu Ser Gly Ser Phe Glu Ala Asn Pro Pro Phe Cys Glu Glu Leu 
545 550 555 560 

Met Asp Ala Met Val Ser His Phe Glu Arg Leu Leu Glu Ser Ser Pro 
565 570 575 

Glu Pro Leu Ser Phe Ile Val Phe Ile Pro Glu Trp Arg Glu Pro Pro 
580 585 590 

Thr Pro Ala Leu Thr Arg Met Glu Gln Ser Arg Phe Lys Arg His Gln 
595 600 605 

Leu Ile Leu Pro Ala Phe Glu His Glu Tyr Arg Ser Gly Ser Gln His 
610 615 620 

Ile Cys Lys Lys Glu Glu Met His Tyr Lys Ala Val His Asn Thr Ala 
625 630 635 640 

Val Leu Phe Leu Gln Asn Asp Pro Gly Phe Ala Lys Trp Ala Pro Thr 
645 650 655 

Pro Glu Arg Leu Gln Glu Leu Ser Ala Ala Tyr Arg Gln Ser Gly Arg 
660 665 670 

Ser His Ser Ser Gly Ser Ser Ser Ser Ser Ser Ser Glu Ala Lys Asp 
675 680 685 

Arg Asp Ser Gly Arg Glu Gln Gly Pro Ser Arg Glu Pro His Pro Thr 
690 695 700 
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<210> 2 

<211> 2112 

<212> DNA 

<213> Homo sapiens 

<400> 2 

atggccaatg agaatcacgg cagcccccgg 
ggtacctcca atcagagcca gccctgttct 
ccagaggagc tggtgcatgc aggctgggag 
tactacttca accgattcac caaccagtcc 
gatgtgattt cggacccttt ggggctgaat 
gtggaaactc ccccggctga gaacaagccc 
agcggcaatg gtgtgaagaa gcccaagatt 
gtgcccagct cccccagtat cccaggaacc 
gaagataaac agcaggcagc tctcctacga 
cagaccaatg ctgtcatcaa gcaccggggg 
gtggaactgc tccgctctca gctcatcctg 
cagcagcgag agggcattga gcctccacgg 
aaggtggtag acaaaggatc tgaccccctg 
ccttccatgt ttcgtgaaat catgaacgac 
cgggaggaag ccaagcgcct gctctttaaa 
tccaggagtg catcccctga cagtaggaag 
agctggcttc ggaaggacca ctcagcctcc 
ctgcggaggc agtgtggccc ccacgtctcg 
tgcagtaaga tctaccacat ctccctggag 
gccatcctca aggaaaacaa catctcagag 
ctagtgtact gctacccagt ccggctggct 
atgcacatgg agaacaacgt ggtctgcatc 
cgcaactact tcagcaagct gtggctcctt 
tttgagaggt tcctgccccg ggtctggtgt 
gtgggcctct acgaggggac tggcctgcag 
ctccaccgac tctttggcgt cagcttcgag 
cgccagtact gttctgcctt ccccgacaca 
ctagactttg ctccactgag tggttcattt 
atggatgcca tggtctctca ctttgagaga 
ttcatcgtgt tcatccctga gtggcgggaa 
cagagccgct tcaaacgcca ccagttgatc 
ggctcccagc acatctgcaa gaaggaggaa 
gtgctcttcc tacagaacga ccctggcttt 
caggagctga gtgctgccta ccggcagtca 
tcgtcctcct cggaggccaa ggaccgggac 
cctcacccca ct 



<210> 3 

<211> 2669 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (292) . • (2406) 
<400> 3 

acacaagatg gcggcagcgg cgctggggag 



gaggaagcgt ccctgctgag tcactcccca 60 
ccaaagccaa tccgcctggt tcaggacctc 120 
aagtgctgga gccggaggga gaatcgtccc 180 
ctgtgggaga tgcccgtgct ggggcagcac 240 
gcgaccccac tgccccaaga ctcaagcttg 300 
agaaagcggc agctctcgga agagcagcca 360 
gaaatcccag tgacacccac aggccagtcg 420 
ccaacgctga agatgtgggg tacgtcccct 480 
cccactgagg tctactggga cctggacatc 540 
ccttcagagg tgctgccccc gcatcccgaa 600 
aagcttcggc agcactatcg ggagctgtgc 660 
gagtctttca accgctggat gctggagcgc 720 
ttgcccagca actgtgaacc agtcgtgtca 780 
attcctatca ggttatcccg aatcaagttc 840 
tatgcggagg ccgccaggcg gctcatcgag 900 
gtggtcaaat ggaatgtgga agacaccttt 960 
aaggaggact acatggatcg cctggagcat 1020 
gccgcagcca aggactccgt ggaaggcatc 1080 
tacgtcaaac ggatccgaga gaagcacctt 1140 
gaggtggagg cccctgaggt ggagccccgc 1200 
gtgtctgcac cgcccatgcc cagcgtggag 1260 
cggtataagg gagagatggt caaggtcagc 1320 
taccgctaca gctgcattga tgactctgcc 1380 
cttctccgac ggtaccagat gatgttcggc 1440 
ggatcgctgc ctgtgcatgt ctttgaggcc 1500 
tgcttcgcct cacccctcaa ctgctacttc 1560 
gacggctact ttggctcccg cgggccctgc 1620 
gaggccaacc ctcccttctg cgaggagctc 1680 
ctgcttgaga gctcaccgga gcccctgtcc 1740 
cccccaacac cagcgctcac ccgcatggag 1800 
ctgcctgcct ttgagcatga gtaccgcagt 1860 
atgcactaca aggccgtcca caacacggct 1920 
gccaagtggg cgccgacgcc tgaacggctg 1980 
ggccgcagcc acagctctgg ttcttcctca 2040 
tcgggccgtg agcagggtcc tagccgcgag 2100 

2112 



ggcgaggcgg aggcggcaaa acgggcggtc 60 



3 
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gagcagaacg tgtagccgcg tcccctccag tccgctccgg 
tcccctgggc tcccgtccac tccactgctg accagcccat 
tgcaggcctt tccttgcctc tgtgggaccc tgtgggggtc 
gcctctcatg ctaacgttgc agaccccaga gggtcctgtg 



aat 
Asn 

tec 
Ser 

cgc 
Arg 
35 
aag 
Lys 

acc 
Thr 

att 
He 

age 
Ser 

etc 
Leu 
115 
gaa 
Glu 

ate 
He 

aaa 
Lys 

gac 
Asp 

ctg 
Leu 
195 
aag 
Lys 

gag 
Glu 

gta 
Val 

gtg 
Val 



gag 
Glu 

cca 
Pro 
20 
ctg 
Leu 

tgc 
Cys 

aac 
Asn 

teg 
Ser 

ttg 
Leu 
100 
teg 
Ser 

ate 
He 

cca 
Pro 

cag 
Gin 

ate 
He 
180 
ccc 
Pro 

ctt 
Leu 

cct 
Pro 

gac 
Asp 

tea 
Ser 
260 



aat cac ggc age 
Asn His Gly Ser 
5 

ggt acc tec aat 
Gly Thr Ser Asn 



gtt cag 
Val Gin 

tgg age 
Trp Ser 

cag tec 
Gin Ser 
70 

gac cct 
Asp Pro 
85 

gtg gaa 
Val Glu 



gac etc 
Asp Leu 
40 

egg agg 
Arg Arg 

55 
ctg tgg 
Leu Trp 

ttg ggg 
Leu Gly 

act ccc 
Thr Pro 



gaa 
Glu 

cca 
Pro 

gga 
Gly 

cag 
Gin 
165 
cag 
Gin 



gag cag 
Glu Gin 



gtg 
Val 

acc 
Thr 
150 
gca 
Ala 



aca 
Thr 
135 
cca 
Pro 

get 
Ala 



cca 
Pro 
120 
ccc 
Pro 

acg 
Thr 

etc 
Leu 



acc aat get 
Thr Asn Ala 



ccg cat 
Pro His 

egg cag 
Arg Gin 



cca 
Pro 

aaa 
Lys 
245 
cct 
Pro 



egg 
Arg 
230 
gga 
Gly 

tec 
Ser 



ccc 
Pro 

cac 
His 
215 
gag 
Glu 



gaa 
Glu 
200 
tat 
Tyr 

tct 
Ser 



tct gac 
Ser Asp 

atg ttt 
Met Phe 



ccc 
Pro 

cag 
Gin 
25 
cca 
Pro 

gag 
Glu 

gag 
Glu 

ctg 
Leu 

ccg 
Pro 
105 
age 
Ser 

aca 
Thr 

ctg 
Leu 

eta 
Leu 

gtc 
Val 
185 
gtg 
Val 

egg 
Arg 

ttc 
Phe 

ccc 
Pro 

cgt 
Arg 
265 



egg gag gaa gcg 
Arg Glu Glu Ala 
10 

age cag ccc tgt 
Ser Gin Pro Cys 



gag gag ctg 
Glu Glu Leu 



aat 
Asn 

atg 
Met 

aat 
Asn 
90 
get 
Ala 



cgt ccc 
Arg Pro 
60 

ccc gtg 
Pro Val 

75 
gcg acc 
Ala Thr 

gag aac 
Glu Asn 



gtg 
Val 
45 
tac 
Tyr 

ctg 
Leu 

cca 
Pro 

aag 
Lys 



ggc aat 
Gly Asn 

ggc cag 
Gly Gin 

aag atg 
Lys Met 
155 
cga ccc 
Arg Pro 
170 

ate aag 
He Lys 

gaa ctg 
Glu Leu 

gag ctg 
Glu Leu 



aac 
Asn 

ctg 
Leu 
250 
gaa 
Glu 



cgc 
Arg 
235 
ttg 
Leu 

ate 
He 



ggt gtg 
Gly Val 
125 
teg gtg 
Ser Val 
140 

tgg ggt 

Trp Gly 

act gag 
Thr Glu 

cac egg 
His Arg 

etc cgc 
Leu Arg 
205 
tgc cag 
Cys Gin 
220 

tgg atg 
Trp Met 

ccc age 
Pro Ser 

atg aac 
Met Asn 



geagctgetg atgcaaggaa 120 
tcgcctgtgc tgagtcttcc 180 
catccggctg gagaagaaaa 240 
tgggtgtgga g atg gee 297 
Met Ala 
1 

tec ctg ctg agt cac 345 
Ser Leu Leu Ser His 
15 

tct cca aag cca ate 393 
Ser Pro Lys Pro He 
30 

cat gca ggc tgg gag 441 
His Ala Gly Trp Glu 
50 

tac ttc aac cga ttc 489 
Tyr Phe Asn Arg Phe 
65 

ggg cag cac gat gtg 537 
Gly Gin His Asp Val 
80 

ctg ccc caa gac tea 585 
Leu Pro Gin Asp Ser 
95 

ccc aga aag egg cag 633 

Pro Arg Lys Arg Gin 

110 

aag aag ccc aag att 681 
Lys Lys Pro Lys He 
130 

ccc age tec ccc agt 729 
Pro Ser Ser Pro Ser 
145 

acg tec cct gaa gat 777 
Thr Ser Pro Glu Asp 
160 

gtc tac tgg gac ctg 825 
Val Tyr Trp Asp Leu 
175 

ggg cct tea gag gtg 873 
Gly Pro Ser Glu Val 
190 

tct cag etc ate ctg 921 
Ser Gin Leu He Leu 
210 

cag cga gag ggc att 969 
Gin Arg Glu Gly He 
225 

ctg gag cgc aag gtg 1017 
Leu Glu Arg Lys Val 
240 

aac tgt gaa cca gtc 1065 
Asn Cys Glu Pro Val 
255 

gac att cct ate agg 1113 
Asp He Pro He Arg 
270 



4 
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1161 



1209 



1257 



1305 



1401 



1449 



1497 



tta tec cga ate aag ttc egg gag gaa gee aag cgc ctg etc ttt aaa 
Leu Ser Arg He Lys Phe Arg Glu Glu Ala Lys Arg Leu Leu Phe Lys 
275 280 285 290 

tat gcg gag gec gec agg egg etc ate gag tec agg agt gca tec cct 
Tyr Ala Glu Ala Ala Arg Arg Leu He Glu Ser Arg Ser Ala Ser Pro 

295 300 305 

gac agt agg aag gtg gtc aaa tgg aat gtg gaa gac ace ttt age tgg 
Asp Ser Arg Lys Val Val Lys Trp Asn Val Glu Asp Thr Phe Ser Trp 

310 315 320 

ctt egg aag gac cac tea gee tec aag gag gac tac atg gat cgc ctg 
Leu Arg Lys Asp His Ser Ala Ser Lys Glu Asp Tyr Met Asp Arg Leu 

325 330 335 

gag cat ctg egg agg cag tgt ggc ccc cac gtc teg gee gca gec aag 1353 
Glu His Leu Arg Arg Gin Cys Gly Pro His Val Ser Ala Ala Ala Lys 

340 345 350 

gac tcc gtg gaa ggc atc tgc agt aag atc tac cac atc tcc ctg gag
Asp Ser Val Glu Gly Ile Cys Ser Lys Ile Tyr His Ile Ser Leu Glu
355 360 365 370

tac gtc aaa cgg atc cga gag aag cac ctt gcc atc ctc aag gaa aac
Tyr Val Lys Arg Ile Arg Glu Lys His Leu Ala Ile Leu Lys Glu Asn
375 380 385

aac atc tca gag gag gtg gag gcc cct gag gtg gag ccc cgc cta gtg
Asn Ile Ser Glu Glu Val Glu Ala Pro Glu Val Glu Pro Arg Leu Val
390 395 400

tac tgc tac cca gtc cgg ctg gct gtg tct gca ccg ccc atg ccc agc 1545
Tyr Cys Tyr Pro Val Arg Leu Ala Val Ser Ala Pro Pro Met Pro Ser
405 410 415

gtg gag atg cac atg gag aac aac gtg gtc tgc atc cgg tat aag gga 1593
Val Glu Met His Met Glu Asn Asn Val Val Cys Ile Arg Tyr Lys Gly
420 425 430

gag atg gtc aag gtc agc cgc aac tac ttc agc aag ctg tgg ctc ctt 1641
Glu Met Val Lys Val Ser Arg Asn Tyr Phe Ser Lys Leu Trp Leu Leu
435 440 445 450

tac cgc tac agc tgc att gat gac tct gcc ttt gag agg ttc ctg ccc 1689
Tyr Arg Tyr Ser Cys Ile Asp Asp Ser Ala Phe Glu Arg Phe Leu Pro
455 460 465

cgg gtc tgg tgt ctt ctc cga cgg tac cag atg atg ttc ggc gtg ggc 1737
Arg Val Trp Cys Leu Leu Arg Arg Tyr Gln Met Met Phe Gly Val Gly
470 475 480

ctc tac gag ggg act ggc ctg cag gga tcg ctg cct gtg cat gtc ttt 1785
Leu Tyr Glu Gly Thr Gly Leu Gln Gly Ser Leu Pro Val His Val Phe
485 490 495

gag gcc ctc cac cga ctc ttt ggc gtc agc ttc gag tgc ttc gcc tca 1833
Glu Ala Leu His Arg Leu Phe Gly Val Ser Phe Glu Cys Phe Ala Ser
500 505 510

ccc ctc aac tgc tac ttc cgc cag tac tgt tct gcc ttc ccc gac aca 1881
Pro Leu Asn Cys Tyr Phe Arg Gln Tyr Cys Ser Ala Phe Pro Asp Thr
515 520 525 530

gac ggc tac ttt ggc tcc cgc ggg ccc tgc cta gac ttt gct cca ctg 1929
Asp Gly Tyr Phe Gly Ser Arg Gly Pro Cys Leu Asp Phe Ala Pro Leu
535 540 545

agt ggt tca ttt gag gcc aac cct ccc ttc tgc gag gag ctc atg gat 1977
Ser Gly Ser Phe Glu Ala Asn Pro Pro Phe Cys Glu Glu Leu Met Asp
550 555 560

gcc atg gtc tct cac ttt gag aga ctg ctt gag agc tca ccg gag ccc 2025
Ala Met Val Ser His Phe Glu Arg Leu Leu Glu Ser Ser Pro Glu Pro
565 570 575
ctg tcc ttc atc gtg ttc atc cct gag tgg cgg gaa ccc cca aca cca 2073
Leu Ser Phe Ile Val Phe Ile Pro Glu Trp Arg Glu Pro Pro Thr Pro
580 585 590

gcg ctc acc cgc atg gag cag agc cgc ttc aaa cgc cac cag ttg atc 2121
Ala Leu Thr Arg Met Glu Gln Ser Arg Phe Lys Arg His Gln Leu Ile
595 600 605 610

ctg cct gcc ttt gag cat gag tac cgc agt ggc tcc cag cac atc tgc 2169
Leu Pro Ala Phe Glu His Glu Tyr Arg Ser Gly Ser Gln His Ile Cys
615 620 625

aag aag gag gaa atg cac tac aag gcc gtc cac aac acg gct gtg ctc 2217
Lys Lys Glu Glu Met His Tyr Lys Ala Val His Asn Thr Ala Val Leu
630 635 640

ttc cta cag aac gac cct ggc ttt gcc aag tgg gcg ccg acg cct gaa 2265
Phe Leu Gln Asn Asp Pro Gly Phe Ala Lys Trp Ala Pro Thr Pro Glu
645 650 655

cgg ctg cag gag ctg agt gct gcc tac cgg cag tca ggc cgc agc cac 2313
Arg Leu Gln Glu Leu Ser Ala Ala Tyr Arg Gln Ser Gly Arg Ser His
660 665 670

agc tct ggt tct tcc tca tcg tcc tcc tcg gag gcc aag gac cgg gac 2361
Ser Ser Gly Ser Ser Ser Ser Ser Ser Ser Glu Ala Lys Asp Arg Asp
675 680 685 690

tcg ggc cgt gag cag ggt cct agc cgc gag cct cac ccc act taa 2406
Ser Gly Arg Glu Gln Gly Pro Ser Arg Glu Pro His Pro Thr
695 700

catatcctgc ggggaggagg agccccaggg gtgctagtct ggactgctgg gactcgggcc 2466
cctggggcct cagagggacc ccggctgcca ctgacatatg aagattatgg ttctgccagg 2526
gctcccctcc ctgcctgtcc ccaagtcctc acctcaaact ccctccaagt cccatgtata 2586
taggtcctga tgccttccca accccgcccc tcaccctgtt gccaccttgt ttcatttgta 2646
aaaggaaata cagaaacccc ccc 2669



<210> 4
<211> 26
<212> DNA
<213> Artificial sequence

<220>
<223> Synthesized oligonucleotide

<400> 4
ccgaattcat ggccaatgag aatcac

<210> 5
<211> 26
<212> DNA
<213> Artificial sequence

<220>
<223> Synthesized oligonucleotide

<400> 5
ccgtcgactt aagtggggtg aggctc 26



<210> 6
<211> 34
<212> DNA
<213> Artificial sequence

<220>
<223> Synthesized oligonucleotide

<400> 6
cgaggatccg ttcaggacct cccagaggac gcta



<210> 7
<211> 33
<212> DNA
<213> Artificial sequence

<220>
<223> Synthesized oligonucleotide

<400> 7
cgagaattcc gaaatcacat cgtgctgccc cag



